Abstract. In this paper we describe a new error-correcting code (ECC) 
inspired by the Naccache-Stern cryptosystem. While by far less efficient 
than Turbo codes, the proposed ECC happens to be more efficient than 
some established ECCs for certain sets of parameters. 

The new ECC adds an appendix to the message. The appendix is the 
modular product of small primes representing the message bits. The 
receiver recomputes the product and detects transmission errors using 
modular division and lattice reduction. 


1 Introduction 

Error-correcting codes (ECCs) are essential to ensure reliable communication.
ECCs work by adding redundancy which enables detecting and
correcting mistakes in received data. This extra information is, of course,
costly and it is important to keep it to a minimum: there is a trade-off
between how much data is added for error correction purposes (bandwidth),
and the number of errors that can be corrected (correction capacity).

Shannon showed [13] in 1948 that it is in theory possible to encode
messages with a minimal number of extra bits¹. Two years later, Hamming [7]
proposed a construction inspired by parity codes, which provided
both error detection and error correction. Subsequent research saw the
emergence of more efficient codes, such as Reed-Muller [8, 10] and
Reed-Solomon [11]. The latter were generalized by Goppa [6]. These codes are
known as algebraic-geometric codes.

¹ Shannon's theorem states that the best achievable expansion rate is $1 - H_2(p_b)$,
where $H_2$ is binary entropy and $p_b$ is the acceptable error rate.



Convolutional codes were first presented in 1955 [4], while recursive
systematic convolutional codes [1] were introduced in 1991. Turbo codes
[1] were indeed revolutionary, given their closeness to the channel capacity
("near Shannon limit").

Results: This paper presents a new error-correcting code, as well as a form
of message size improvement based on the hybrid use of two ECCs, one
of which is inspired by the Naccache-Stern (NS) cryptosystem [2,9]. For
some codes and parameter choices, the resulting hybrid codes outperform
the two underlying ECCs.

The proposed ECC is unusual because it is based on number theory 
rather than on binary operations. 

2 Preliminaries 
2.1 Notations 

Let $\mathfrak{P} = \{p_1 = 2, \ldots\}$ be the ordered set of prime numbers. Let $\gamma \geq 2$ be
an encoding base. For any $m \in \mathbb{N}$ (the "message"), let $\{m_i\}$ be the digits
of $m$ in base $\gamma$, i.e.:

$$m = \sum_{i=0}^{k-1} \gamma^i m_i, \qquad m_i \in [0, \gamma - 1], \qquad k = \lceil \log_\gamma m \rceil$$

We denote by $h(x)$ the Hamming weight of $x$, i.e. the sum of $x$'s digits in
base 2, and by $|y|$ the bit-length of $y$.
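As a small illustration of this notation, the following Python sketch computes the base-$\gamma$ digits, the Hamming weight and the bit-length of an integer. The function names `digits`, `hamming_weight` and `bit_length` are illustrative choices made for this example and do not come from the paper.

```python
def digits(m, gamma):
    """Digits m_0, m_1, ... of m in base gamma, least significant first."""
    if gamma < 2:
        raise ValueError("the encoding base must be at least 2")
    out = []
    while m > 0:
        m, d = divmod(m, gamma)
        out.append(d)
    return out or [0]


def hamming_weight(x):
    """h(x): number of 1 bits in the binary representation of x."""
    return bin(x).count("1")


def bit_length(y):
    """|y|: number of bits needed to write y in base 2."""
    return y.bit_length()


# 11 = 2 + 0*3 + 1*9, so its base-3 digits are [2, 0, 1].
ternary = digits(11, 3)
rebuilt = sum(d * 3 ** i for i, d in enumerate(ternary))
```

The digits are returned least significant first so that index `i` matches the exponent of $\gamma^i$ in the sum above.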

2.2 Error-Correcting Codes 

Let $\mathcal{M} = \{0,1\}^k$ be the set of messages, $\mathcal{C} = \{0,1\}^n$ the set of encoded
messages. Let $\mathcal{P}$ be a parameter set.

Definition 1 (Error-Correcting Code). An error-correcting code is
a couple of algorithms:

- An algorithm $\mu$, taking as input some message $m \in \mathcal{M}$, as well as
some public parameters params $\in \mathcal{P}$, and outputting $c \in \mathcal{C}$.

- An algorithm $\mu^{-1}$, taking as input $c \in \mathcal{C}$ as well as parameters
params $\in \mathcal{P}$, and outputting $m \in \mathcal{M} \cup \{\bot\}$.

The $\bot$ symbol indicates that decoding failed.


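Definition 2 (Correction Capacity). Let $(\mu, \mu^{-1}, \mathcal{M}, \mathcal{C}, \mathcal{P})$ be an
error-correcting code. There exists an integer $t \geq 0$ and some parameters
params $\in \mathcal{P}$ such that, for all $e \in \{0,1\}^n$ such that $h(e) \leq t$,

$$\mu^{-1}(\mu(m, \text{params}) \oplus e, \text{params}) = m, \quad \forall m \in \mathcal{M}$$

and for all $e$ such that $h(e) > t$,

$$\mu^{-1}(\mu(m, \text{params}) \oplus e, \text{params}) \neq m, \quad \forall m \in \mathcal{M}.$$

$t$ is called the correction capacity of $(\mu, \mu^{-1}, \mathcal{M}, \mathcal{C}, \mathcal{P})$.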

Definition 3. A code of message length $k$, of codeword length $n$ and with
a correction capacity $t$ is called an $(n, k, t)$-code. The ratio $\rho = n/k$ is called
the code's expansion rate.

3 A New Error-Correcting Code 

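Consider in this section an existing $(n, k, t)$-code $C = (\mu, \mu^{-1}, \mathcal{M}, \mathcal{C}, \mathcal{P})$.
For instance $C$ can be a Reed-Muller code. We describe how the new
$(n', k, t)$-code $C' = (\nu, \nu^{-1}, \mathcal{M}, \mathcal{C}', \mathcal{P}')$ is constructed.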

Parameter Generation: To correct $t$ errors in a $k$-bit message, we generate
a prime $p$ such that:

$$2 \cdot p_k^{2t} < p < 4 \cdot p_k^{2t} \qquad (1)$$

As we will later see, the size of $p$ is obtained by bounding the worst case
in which all errors affect the end of the message. $p$ is a part of $\mathcal{P}'$.
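A minimal sketch of this parameter generation step follows. The paper does not prescribe how the prime is found, so the choice of a Miller-Rabin test with fixed small bases, the decision to take the smallest prime above the lower bound, and the helper names `first_primes`, `is_probable_prime` and `generate_modulus` are all assumptions of this example.

```python
def first_primes(count):
    """Return the first `count` primes by incremental trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q for q in primes if q * q <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def is_probable_prime(n):
    """Miller-Rabin with the primes up to 37 as bases.

    This is exact for 64-bit inputs; for larger n it should be read as a
    probabilistic test (an assumption of this sketch).
    """
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for q in bases:
        if n % q == 0:
            return n == q
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def generate_modulus(k, t):
    """Smallest probable prime p with 2 * p_k^(2t) < p < 4 * p_k^(2t)."""
    p_k = first_primes(k)[-1]
    lower = 2 * p_k ** (2 * t)
    p = lower + 1
    while not is_probable_prime(p):
        p += 1
    if p >= 2 * lower:
        raise ValueError("no prime found in the interval of Equation (1)")
    return p
```

For example, `generate_modulus(10, 2)` searches the interval $(2 \cdot 29^4, 4 \cdot 29^4)$, since $p_{10} = 29$.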

Encoding: Assume we wish to transmit a $k$-bit message $m$ over a noisy
channel. Let $\gamma = 2$ so that $m_i$ denotes the $i$-th bit of $m$, and define:

$$c(m) := \prod_{i=1}^{k} p_i^{m_i} \bmod p \qquad (2)$$

The integer generated by Equation (2) is encoded using $C$ to yield
$\mu(c(m))$. Finally, the encoded message $\nu(m)$ transmitted over the noisy
channel is defined as:

$$\nu(m) := m \,\|\, \mu(c(m)) \qquad (3)$$

Note that, if we were to use $C$ directly, we would have encoded $m$ (and
not $c$). As explained in Section 3.1, $c$ is much shorter than $m$ in most
practical situations (the exceptions being very small messages, which are
not interesting in practice) and thereby requires fewer extra bits for
correction. For appropriate parameter choices, this provides a more
efficient encoding than $C$ alone.
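The computation of the redundancy in Equation (2) can be sketched as follows, using the 10-bit message and the modulus of the toy example in Appendix A. The underlying code $\mu$ is left out (treated as the identity), which is a simplification of this sketch; `first_primes` and `redundancy` are hypothetical helper names.

```python
def first_primes(count):
    """Return the first `count` primes by incremental trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q for q in primes if q * q <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def redundancy(bits, p):
    """c(m) = product of p_i^(m_i) mod p, for bits m_1 m_2 ... m_k given as a string."""
    c = 1
    for bit, prime in zip(bits, first_primes(len(bits))):
        if bit == "1":
            c = c * prime % p
    return c


m = "1100100111"
p = 707293                 # modulus used in the Appendix A example
c = redundancy(m, p)       # 2*3*11*19*23*29 = 836418, reduced mod p: 129125
codeword = (m, c)          # nu(m) = m || mu(c(m)), with mu omitted here
```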


Decoding: Let $\alpha$ be the received² message. Assume that at most $t$ errors
occurred during transmission:

$$\alpha = \nu(m) \oplus e = m' \,\|\, (\mu(c(m)) \oplus e')$$

where the error vector $e$ is such that $h(e) = h(m' \oplus m) + h(e') \leq t$.

Since $c(m)$ is encoded with a $t$-error-capacity code, we can recover the
correct value of $c(m)$ from $\mu(c(m)) \oplus e'$ and compute the quantity:

$$s = \frac{c(m')}{c(m)} \bmod p \qquad (4)$$


Using Equation (2), $s$ can be written as:

$$s = \frac{a}{b} \bmod p, \qquad
a = \prod_{(m'_i = 1) \wedge (m_i = 0)} p_i, \qquad
b = \prod_{(m'_i = 0) \wedge (m_i = 1)} p_i \qquad (5)$$


Note that since $h(m' \oplus m) \leq t$, we have that $a$ and $b$ are strictly
smaller than $(p_k)^t$. Theorem 1 from [5] shows that, knowing $t$, the receiver
can recover $a$ and $b$ from $s$ efficiently using a variant of Gauss' algorithm [14].


Theorem 1. Let $a, b \in \mathbb{Z}$ such that $-A \leq a \leq A$ and $0 < b \leq B$. Let $p$
be some prime integer such that $2AB < p$. Let $s = a \cdot b^{-1} \bmod p$. Then
given $A$, $B$, $s$ and $p$, $a$ and $b$ can be recovered in polynomial time.

As $0 \leq a \leq A$ and $0 < b \leq B$ where $A = B = (p_k)^t - 1$ and $2AB < p$
from Equation (1), we can recover $a$ and $b$ from $s$ in polynomial time.
Then, by testing the divisibility of $a$ and $b$ with respect to the small
primes $p_i$, the receiver can recover $m' \oplus m$ and eventually $m$.
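The decoding steps above can be sketched end to end on the numbers of Appendix A. Note that this sketch recovers $a$ and $b$ with the extended Euclidean algorithm (classical rational reconstruction) rather than the Gauss-reduction variant cited in the text; this substitution, the omission of $\mu$, and the helper names are assumptions of the example, not the authors' implementation.

```python
def first_primes(count):
    """Return the first `count` primes by incremental trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q for q in primes if q * q <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def redundancy(bits, p):
    """c(m) = product of p_i^(m_i) mod p, for a bit string m_1 ... m_k."""
    c = 1
    for bit, prime in zip(bits, first_primes(len(bits))):
        if bit == "1":
            c = c * prime % p
    return c


def rational_reconstruct(s, p, bound):
    """Return (a, b) with a = b*s mod p and |a| <= bound, b > 0, via extended Euclid."""
    r0, r1 = p, s % p
    t0, t1 = 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 < 0:
        r1, t1 = -r1, -t1
    return r1, t1


def decode(received_bits, c, p, t):
    """Correct up to t bit errors in received_bits, given the error-free c(m)."""
    primes = first_primes(len(received_bits))
    bound = primes[-1] ** t                       # a and b are below p_k^t
    s = redundancy(received_bits, p) * pow(c, -1, p) % p
    a, b = rational_reconstruct(s, p, bound)
    bits = list(received_bits)
    for i, q in enumerate(primes):
        if a % q == 0:
            bits[i] = "0"                         # bit i was flipped from 0 to 1
        elif b % q == 0:
            bits[i] = "1"                         # bit i was flipped from 1 to 0
    corrected = "".join(bits)
    # Re-encode as a sanity check; None plays the role of the failure symbol.
    return corrected if redundancy(corrected, p) == c else None


p = 707293
recovered = decode("1100101011", 129125, p, t=2)
```

The final re-encoding check is a design choice of this sketch: it turns an unsuccessful reconstruction into an explicit failure instead of a silently wrong message.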

A numerical example is given in Appendix A. 


Bootstrapping: Note that instead of using an existing code as a
subcontractor for protecting $c(m)$, the sender may also recursively apply the
new scheme described above. To do so consider $c(m)$ as a message, and
protect $\bar{c} = c(c(\cdots c(c(m))))$, which is a rather small value, against
accidental alteration by replicating it $2t + 1$ times. The receiver will use a
majority vote to detect the errors in $\bar{c}$.
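The replication and majority vote used at the last level of the recursion can be sketched as follows. The bit-string representation and the function names `replicate`, `majority_vote` and `flip` are assumptions made for illustration.

```python
def replicate(c_bits, t):
    """Send 2t + 1 copies of the bit string c_bits back to back."""
    return c_bits * (2 * t + 1)


def majority_vote(received, length, t):
    """Recover each bit as the value held by more than t of the 2t + 1 copies."""
    copies = [received[i * length:(i + 1) * length] for i in range(2 * t + 1)]
    return "".join(
        "1" if sum(copy[j] == "1" for copy in copies) > t else "0"
        for j in range(length)
    )


def flip(bits, positions):
    """Flip the bits at the given positions (used here to simulate channel errors)."""
    out = list(bits)
    for pos in positions:
        out[pos] = "1" if out[pos] == "0" else "0"
    return "".join(out)


c_bar = "10110"
t = 2
sent = replicate(c_bar, t)
# Worst case for a single bit: both errors hit the same bit in two different copies.
corrupted = flip(sent, [0, 5])
recovered = majority_vote(corrupted, len(c_bar), t)
```

With at most $t$ flipped bits, every bit position still has at least $t + 1$ correct copies, which is why the vote succeeds.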

² i.e. encoded and potentially corrupted.





3.1 Performance of the New Error-Correcting Code for $\gamma = 2$

Lemma 1. The bit-size of $c(m)$ is:

$$\log_2 p \simeq 2 \cdot t \log_2(k \ln k). \qquad (6)$$

Proof. From Equation (1) and the Prime Number Theorem³. □

The total output length of the new error-correcting code is therefore
$\log_2 p$, plus the length $k$ of the message $m$.
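Equation (6) is easy to evaluate numerically. The sketch below does so for two parameter sets; rounding the estimate up to a whole number of bits is an assumption of this example, and `redundancy_bits` is a hypothetical name.

```python
from math import ceil, log, log2


def redundancy_bits(k, t):
    """Estimate of log2(p) from Equation (6): 2 * t * log2(k * ln k)."""
    return 2 * t * log2(k * log(k))


small = ceil(redundancy_bits(382, 7))        # about 156.1, rounded up to 157
large = ceil(redundancy_bits(65536, 255))    # about 9930.3, rounded up to 9931
```

In both cases the redundancy is far shorter than the $k$-bit message it protects, which is the property the hybrid construction relies on.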

$C'$ outperforms the initial error-correcting code $C$ if, for equal error
capacity $t$ and message length $k$, it outputs a shorter encoding, which
happens if $n' < n$, keeping in mind that both $n$ and $n'$ depend on $k$.

Corollary 1. Assume that there exists a constant $\delta > 1$ such that, for $k$
large enough, $n(k) \geq \delta k$. Then for $k$ large enough, $n'(k) < n(k)$.

Proof. Let $k$ be the size of $m$ and $k'$ be the size of $c(m)$.

We have $n'(k) = k + n(k')$, therefore

$$n(k) - n'(k) = n(k) - (k + n(k')) \geq (\delta - 1)k - n(k').$$

Now,

$$(\delta - 1)k - n(k') > 0 \iff (\delta - 1)k > n(k').$$

But $n(k') \geq \delta k'$, hence

$$(\delta - 1)k > \delta k' \implies k > \frac{k'\delta}{\delta - 1}.$$

Finally, from Lemma 1, $k' = O(\ln \ln k!)$, which guarantees that there
exists a value of $k$ above which $n'(k) < n(k)$. □

In other terms, any correcting code whose encoded message size is
growing linearly with message size can benefit from the described
construction.

Expansion Rate: Let $k$ be the length of $m$ and consider the bit-size of the
corresponding codeword as in Equation (6). The expansion rate $\rho$ is:

$$\rho = \frac{|m \,\|\, \mu(c(m))|}{|m|} = \frac{k + |\mu(c(m))|}{k} = 1 + \frac{|\mu(c(m))|}{k} \qquad (7)$$


³ $p_k \sim k \ln k$.








Fig. 1. Illustration of Corollary 1. For large enough values of k, the new 
ECC uses smaller codewords as compared to the underlying ECC. 


Reed-Muller Codes: We illustrate the idea with Reed-Muller codes.
Reed-Muller (R-M) codes are a family of linear codes. Let $r > 0$ be an
integer and $N = \log_2 n$; the code can be applied to messages of size

$$k = \sum_{i=0}^{r} \binom{N}{i} \qquad (8)$$

Such a code can correct up to $t = 2^{N-r-1} - 1$ errors. Some examples
of $(n, k, t)$ triples are given in Table 1. For instance, a message of size 163
bits can be encoded as a 256-bit string, among which up to 7 errors can
be corrected.


  n |  16   64  128  256  512  2048  8192  32768  131072
  k |  11   42   99  163  382  1024  5812   9949   65536
  t |   1    3    3    7    7    31    31    255     255

Table 1. Examples of length $n$, dimension $k$, and error capacity $t$ for
Reed-Muller code.
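The entries of Table 1 can be reproduced from the order $r$ and $N = \log_2 n$. The sketch below assumes the standard Reed-Muller dimension formula, with the sum starting at $i = 0$, together with $t = 2^{N-r-1} - 1$; the function name `reed_muller_params` is hypothetical.

```python
from math import comb


def reed_muller_params(r, N):
    """Return (n, k, t) for the Reed-Muller code of order r and length 2^N."""
    if not 0 <= r < N:
        raise ValueError("expected 0 <= r < N")
    n = 2 ** N
    k = sum(comb(N, i) for i in range(r + 1))
    t = 2 ** (N - r - 1) - 1
    return n, k, t


# Three columns of Table 1, indexed by (r, N).
rows = [reed_muller_params(2, 4),
        reed_muller_params(4, 8),
        reed_muller_params(7, 13)]
```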


To illustrate the benefit of our approach, consider a 5812-bit message,
which we wish to protect against up to 31 errors. A direct use of
Reed-Muller would require $n(5812) = 8192$ bits as seen in Table 1.
Contrast this with our code, which only has to protect $c(m)$, that is
931 bits as shown by Equation (6), yielding a total size of
$5812 + n(931) = 5812 + 2048 = 7860$ bits.

Other parameters for the Reed-Muller primitive are illustrated in
Table 2.


  n'         |  638   7860   98304
  k          |  382   5812   65536
  |c(m)|     |  157    931    9931
  |RM(c(m))| |  256   2048   32768
  t          |    7     31     255

Table 2. $(n', k, t)$-codes generated from Reed-Muller by our construction.


Table 2 shows that for large message sizes and a small number of 
errors, our error-correcting code slightly outperforms Reed-Muller code. 
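The totals in Table 2 can be rechecked from Table 1 with a small lookup, sketched below. The sizes of $c(m)$ are taken from Table 2 as given; picking the shortest Table 1 code that is large enough, and the names `TABLE_1`, `rm_length` and `hybrid_length`, are assumptions of this example. By our evaluation, Equation (6) gives roughly 969 bits rather than 931 for $k = 5812$ and $t = 31$; either value fits in the 1024-bit message of the $(2048, 1024, 31)$ code, so the total of 7860 bits is unaffected.

```python
# (n, k, t) triples from Table 1.
TABLE_1 = [
    (16, 11, 1), (64, 42, 3), (128, 99, 3), (256, 163, 7), (512, 382, 7),
    (2048, 1024, 31), (8192, 5812, 31), (32768, 9949, 255), (131072, 65536, 255),
]


def rm_length(k, t):
    """Shortest Table 1 codeword length able to carry k bits with capacity t."""
    fits = [n for n, kk, tt in TABLE_1 if kk >= k and tt >= t]
    if not fits:
        raise ValueError("no code in Table 1 fits these parameters")
    return min(fits)


def hybrid_length(k, c_bits, t):
    """n'(k) = k + n(k'): the message itself plus the protected redundancy."""
    return k + rm_length(c_bits, t)


direct = rm_length(5812, 31)             # 8192 bits with Reed-Muller alone
hybrid = hybrid_length(5812, 931, 31)    # 5812 + 2048 = 7860 bits
```

For the first column of Table 2 the same lookup gives 638 bits against 512 bits for Reed-Muller alone, which is consistent with the gain only appearing for large messages.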

3.2 The case $\gamma > 2$

The difficulty in the case $\gamma > 2$ stems from the fact that a binary error
in a $\gamma$-base message will in essence scramble all digits preceding the error.
As an example,

$$1220021012202012010011120202_3 + 2^{30} = 1220021022112000112220110110_3.$$

Hence, unless $\gamma = 2^\tau$ for some $\tau$, a generalization makes sense only for
channels over which transmission uses $\gamma$ symbols. In such cases, we have
the following: a $k$-bit message $m$ is pre-encoded as a $\gamma$-base $\kappa$-symbol
message $m'$. Here $\kappa = \lceil k / \log_2 \gamma \rceil$. Equation (1) becomes:

$$2 \cdot p_\kappa^{2t(\gamma - 1)} < p < 4 \cdot p_\kappa^{2t(\gamma - 1)}$$

Comparison with the binary case is complicated by the fact that here $t$
refers to the number of errors of any kind, regardless of their semiologic
meaning. In other words, an error transforming a 0 into a 2 counts exactly
as an error transforming a 0 into a 1.
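The digit scrambling described at the start of this subsection can be checked directly on the ternary example. The sketch below adds $2^{30}$ to the first number and counts how many base-3 digits change; the helper name `to_base` is hypothetical.

```python
def to_base(x, gamma):
    """Write the non-negative integer x in base gamma, most significant digit first."""
    out = ""
    while x:
        x, d = divmod(x, gamma)
        out = str(d) + out
    return out or "0"


before = "1220021012202012010011120202"
after = to_base(int(before, 3) + 2 ** 30, 3)

# Number of ternary digits that differ after a change worth a single bit.
changed = sum(x != y for x, y in zip(before, after))
```

A change worth one binary digit alters 14 of the 28 ternary digits here, which is why the generalization is only meaningful on channels that transmit $\gamma$-ary symbols.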

Example 1. As a typical example, for $t = 7$, $\kappa = 10^6$ and $\gamma = 3$,
$p_\kappa = 15485863$ and $p$ is a 690-bit number.

For the sake of comparison, $t = 7$, $k = 1584963$ (corresponding to
$\kappa = 10^6$) and $\gamma = 2$ yield $p_k = 25325609$ and a 346-bit $p$.













4 Improvement Using Smaller Primes 


The construction described in the previous section can be improved by
choosing a smaller prime $p$, but this comes at a price: decoding becomes
only heuristic.


Parameter Generation: The idea consists in generating a prime $p$ smaller
than before. Namely, we generate a $p$ satisfying:

$$2^u \cdot p_k^t < p < 2^{u+1} \cdot p_k^t \qquad (9)$$

for some small integer $u \geq 1$.


Encoding and Decoding: Encoding remains as previously. The redundancy
$c(m)$ being approximately half the size of the previous section's, we
have:

$$s = \frac{a}{b} \bmod p, \qquad
a = \prod_{(m'_i = 1) \wedge (m_i = 0)} p_i, \qquad
b = \prod_{(m'_i = 0) \wedge (m_i = 1)} p_i \qquad (10)$$

and since there are at most $t$ errors, we must have:

$$a \cdot b \leq (p_k)^t \qquad (11)$$


We define a finite sequence $\{A_i, B_i\}$ of integers such that $A_i = 2^{u \cdot i}$ and
$B_i = \lfloor 2p / A_i \rfloor$. From Equations (9) and (11) there must be at least one
index $i$ such that $0 \leq a \leq A_i$ and $0 < b \leq B_i$. Then using Theorem 1,
given $A_i$, $B_i$, $p$ and $s$, the receiver can recover $a$ and $b$, and eventually $m$.

The problem with that approach is that we lose the guarantee that
$\{a, b\}$ is unique. Namely, we may find another $\{a', b'\}$ satisfying
Equation (10) for some other index $i'$. We expect this to happen with negligible
probability for large enough $u$, but this makes the modified code heuristic
(while perfectly implementable for all practical purposes).


4.1 Performance 

Lemma 2. The bit-size of $c(m)$ is:

$$\log_2 p \simeq u + t \log_2(k \ln k). \qquad (12)$$

Proof. Using Equation (9) and the Prime Number Theorem. □


Thus, the smaller prime variant has a shorter $c(m)$.

As $u$ is a small integer (e.g. $u = 50$), it follows immediately from
Equation (1) that, for large $n$ and $t$, the size of the new prime $p$ will
be approximately half the size of the prime $p$ generated in the preceding
section.

This brings down the minimum message size $k$ above which our
construction provides an improvement over the bare underlying correcting
code.

Note: In the case of Reed-Muller codes, this variant provides no
improvement over the technique described in Section 3 for the following reasons:
(1) by design, Reed-Muller codeword lengths are powers of 2; and (2)
Equation (12) cannot yield a twofold reduction in $p$. Therefore we cannot hope
to reduce $p$ enough to get a smaller codeword.

This does not preclude other codes from showing benefits, but the authors
did not look for such codes.

5 Prime Packing Encoding 

It is interesting to see whether the optimization technique of [2] yields
more efficient ECCs. Recall that in [2], the $p_i$s are distributed amongst
$\kappa$ packs. Information is encoded by picking one $p_i$ per pack. This has an
immediate impact on decoding: when an error occurs and a symbol $\sigma$ is
replaced by a symbol $\sigma'$, both the numerator and the denominator of $s$
are affected by additional prime factors.

Let $C = (\mu, \mu^{-1}, \mathcal{M}, \mathcal{C}, \mathcal{P})$ be a $t$-error capacity code, such that it is
possible to efficiently recover $c$ from $\mu(c) \oplus e$ for any $c$ and any $e$, where
$h(e) \leq t$. Let $\gamma \geq 2$ be a positive integer.

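Before we proceed, we define $\kappa := \lceil k / \log_2 \gamma \rceil$ and

$$f := f(\kappa, \gamma, t) = \prod_{i=\kappa - t}^{\kappa} p_{\gamma i}.$$

Parameter Generation: Let $p$ be a prime number such that:

$$2 \cdot f^2 < p < 4 \cdot f^2 \qquad (13)$$

Let $\mathcal{C}' = \mathcal{M} \times \mathbb{Z}_p$ and $\mathcal{P}' = (\mathcal{P} \cup \mathfrak{P}) \times \mathbb{N}$. We now construct a variant of
the ECC presented in Section 3 from $C$ and denote it

$$C' = (\nu, \nu^{-1}, \mathcal{M}, \mathcal{C}', \mathcal{P}').$$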


Encoding: We define the "redundancy" of a $k$-bit message $m \in \mathcal{M}$
(represented as $\kappa$ digits in base $\gamma$) by:

$$c(m) := \prod_{i=0}^{\kappa - 1} p_{i\gamma + m_i + 1} \bmod p$$

A message $m$ is encoded as follows:

$$\nu(m) := m \,\|\, \mu(c(m))$$

Decoding: The received information $\alpha$ differs from $\nu(m)$ by a certain
number of bits. Again, we assume that the number of these differing bits
is at most $t$. Therefore $\alpha = \nu(m) \oplus e$, where $h(e) \leq t$. Write $e = e_m \| e_c$
such that

$$\alpha = \nu(m) \oplus e = (m \oplus e_m) \,\|\, (\mu(c(m)) \oplus e_c) = m' \,\|\, (\mu(c(m)) \oplus e_c).$$

Since $h(e) = h(e_m) + h(e_c) \leq t$, the receiver can recover efficiently
$c(m)$ from $\alpha$. It is then possible to compute


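$$s := \frac{c(m')}{c(m)} \bmod p
= \frac{\prod_{i=0}^{\kappa-1} p_{i\gamma + m'_i + 1}}{\prod_{i=0}^{\kappa-1} p_{i\gamma + m_i + 1}} \bmod p,$$

that is,

$$s = \frac{a}{b} \bmod p, \qquad
a = \prod_{m_i \neq m'_i} p_{i\gamma + m'_i + 1}, \qquad
b = \prod_{m_i \neq m'_i} p_{i\gamma + m_i + 1} \qquad (14)$$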


As $h(e) = h(e_m) + h(e_c) \leq t$, we have that $a$ and $b$ are strictly smaller
than $f(\kappa, \gamma, t)$. As $A = B = f(\kappa, \gamma, t) - 1$, we observe from Equation (13)
that $2AB < p$. We are now able to recover $a$, $b$, with $\gcd(a, b) = 1$, such that
$s = a/b \bmod p$ using lattice reduction [14].

Testing the divisibility of $a$ and $b$ by $p_1, \ldots, p_{\kappa\gamma}$ the receiver can
recover $e_m = m' \oplus m$, and from that get $m = m' \oplus e_m$. Note that by
construction only one prime amongst $\gamma$ is used per "pack": the receiver
can therefore skip on average $\gamma/2$ primes in the divisibility testing phase.






5.1 Performance 


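Rosser's theorem [3,12] states that for $n \geq 6$,

$$\ln n + \ln \ln n - 1 < \frac{p_n}{n} < \ln n + \ln \ln n$$

i.e. $p_n < n(\ln n + \ln \ln n)$. Hence a crude upper bound of $p$ is

$$p < 4 f(\kappa, \gamma, t)^2
= 4 \prod_{i=\kappa - t}^{\kappa} p_{i\gamma}^2
< 4 \prod_{i=\kappa - t}^{\kappa} \bigl(i\gamma(\ln i\gamma + \ln \ln(i\gamma))\bigr)^2
\leq 4 \bigl(\kappa\gamma(\ln \kappa\gamma + \ln \ln \kappa\gamma)\bigr)^{2(t+1)}$$

Again, the total output length of the new error-correcting code is
$n' = k + |p|$.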

Plugging $\gamma = 3$, $\kappa = 10^6$ and $t = 7$ into Equation (13) we get a
410-bit $p$. This improves over Example 1 where $p$ was 690 bits long.
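The crude bound derived from Rosser's theorem can be evaluated numerically without enumerating any primes. The sketch below sums the logarithms of the product over $i = \kappa - t, \ldots, \kappa$; the function name `rosser_bound_bits` is hypothetical, and the result is an upper estimate of the bit-size of $p$, not its exact size.

```python
from math import log, log2


def rosser_bound_bits(kappa, gamma, t):
    """log2 of 4 * prod_{i=kappa-t}^{kappa} (i*gamma*(ln(i*gamma) + ln ln(i*gamma)))^2."""
    bits = 2.0                                    # log2 of the leading factor 4
    for i in range(kappa - t, kappa + 1):
        n = i * gamma
        bits += 2 * log2(n * (log(n) + log(log(n))))
    return bits


estimate = rosser_bound_bits(kappa=10 ** 6, gamma=3, t=7)
```

For these parameters the bound comes out slightly above 410 bits, in line with the 410-bit figure quoted above.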
A Toy Example 


Let $m$ be the 10-bit message 1100100111. For $t = 2$, we let $p$ be the
smallest prime number greater than $29^4 = 707281$, i.e. $p = 707293$ (this
modulus is smaller than the bound of Equation (1), but it suffices for this
small example). We generate the redundancy:

$$c(m) = 2^1 \cdot 3^1 \cdot 5^0 \cdot 7^0 \cdot 11^1 \cdot 13^0 \cdot 17^0 \cdot 19^1 \cdot 23^1 \cdot 29^1 \bmod 707293$$

$$\Rightarrow c(m) = 836418 \bmod 707293 = 129125.$$

As we focus on the new error-correcting code we simply omit the
Reed-Muller component. The encoded message is

$$\nu(m) = 1100100111_2 \,\|\, 129125_{10}.$$

Let the received encoded message be $\alpha = 1100101011_2 \,\|\, 129125_{10}$. Thus,

$$c(m') = 2^1 \cdot 3^1 \cdot 5^0 \cdot 7^0 \cdot 11^1 \cdot 13^0 \cdot 17^1 \cdot 19^0 \cdot 23^1 \cdot 29^1 \bmod p$$

$$\Rightarrow c(m') = 748374 \bmod 707293 = 41081.$$

Dividing by $c(m)$ we get

$$s = \frac{c(m')}{c(m)} = \frac{41081}{129125} \bmod 707293 = 632842.$$

Applying the rationalize and factor technique we obtain $s = \frac{17}{19} \bmod 707293$.
It follows that $m' \oplus m = 0000001100$. Flipping the bits retrieved
by this calculation, we recover $m$.



